R Distribution (rdist) — Symmetric Beta on \([-1, 1]\)#
The R distribution (SciPy: scipy.stats.rdist) is a one-parameter family of symmetric, bounded continuous distributions on \([-1, 1]\).
It is a convenient model whenever:
values are naturally constrained to \([-1, 1]\) (correlations, cosines, normalized similarity scores)
symmetry around 0 is reasonable
you want a single concentration parameter controlling mass near the center vs near the endpoints
A key identity makes it especially approachable:
\[ X \sim \mathrm{rdist}(c) \quad\Longleftrightarrow\quad \frac{X+1}{2} \sim \mathrm{Beta}\!\left(\frac{c}{2}, \frac{c}{2}\right). \]
That means we can borrow intuition, derivations, and sampling algorithms from the Beta distribution.
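As a quick, self-contained sanity check (the value of \(c\) and the grid below are illustrative choices, not anything canonical), the identity can be verified numerically by comparing the rdist CDF with the transformed Beta CDF:

```python
# If X ~ rdist(c), then (X+1)/2 ~ Beta(c/2, c/2), so the CDFs must agree:
#   F_rdist(x; c) == F_beta((x+1)/2; c/2, c/2).
import numpy as np
from scipy import stats

c = 4.0                                  # illustrative shape parameter
x = np.linspace(-0.99, 0.99, 201)
lhs = stats.rdist.cdf(x, c)
rhs = stats.beta.cdf((x + 1) / 2, a=c / 2, b=c / 2)
print("max CDF gap:", np.max(np.abs(lhs - rhs)))  # should be near machine epsilon
```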
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
import scipy
from scipy import special, stats
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=6, suppress=True)
rng = np.random.default_rng(42)
print("numpy:", np.__version__)
print("scipy:", scipy.__version__)
numpy: 1.26.2
scipy: 1.15.0
1) Title & Classification#
Distribution name: rdist (R distribution, symmetric beta)
Type: continuous
Support (canonical):
\[ -1 \le x \le 1 \]
Parameter space (canonical):
shape parameter: \(c > 0\)
SciPy parameterization: stats.rdist(c, loc=0, scale=1) uses a location–scale transform
\[ Y = \mathrm{loc} + \mathrm{scale}\,X, \quad \mathrm{scale} > 0, \]
so the support becomes
\[ \mathrm{loc} - \mathrm{scale} \le y \le \mathrm{loc} + \mathrm{scale}. \]
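A minimal sketch of this transform in SciPy (the particular `loc`/`scale` values are arbitrary examples):

```python
# rdist(c, loc, scale) lives on [loc - scale, loc + scale].
import numpy as np
from scipy import stats

c, loc, scale = 3.0, 2.0, 0.5          # support becomes [1.5, 2.5]
dist = stats.rdist(c, loc=loc, scale=scale)
lo, hi = dist.support()
print("support:", lo, hi)              # 1.5 2.5
print("pdf outside support:", dist.pdf(1.0))  # 0.0
```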
2) Intuition & Motivation#
What this distribution models#
rdist models a bounded continuous quantity that is:
constrained to \([-1, 1]\)
symmetric about 0
optionally U-shaped (more mass near \(\pm 1\)) or bell-shaped (more mass near 0)
Typical real-world use cases#
Correlation-like quantities: priors or phenomenological models for correlation coefficients and similarity scores.
Cosines and angles: if \(\theta\) is an angle, \(\cos\theta \in [-1,1]\), so rdist can model cosine-like measurements.
Random directions / projections: if \(U\) is uniform on the unit sphere \(S^{d-1}\subset\mathbb{R}^d\), then the first coordinate \(U_1\) has density proportional to \((1-u^2)^{(d-3)/2}\), which is rdist with
\[ c = d-1. \]
This gives an interpretation of \(c\) as an “effective dimension”.
Relations to other distributions#
Symmetric Beta: \((X+1)/2 \sim \mathrm{Beta}(c/2,c/2)\).
Special cases:
\(c=1\): arcsine law on \([-1,1]\), density \(\propto 1/\sqrt{1-x^2}\)
\(c=2\): uniform on \([-1,1]\)
\(c=3\): semicircle law, density \(\propto \sqrt{1-x^2}\)
Large \(c\): the distribution concentrates near 0 and is well-approximated by a normal with variance \(1/(c+1)\) (after matching mean/variance).
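The large-\(c\) normal approximation can be eyeballed numerically; the value \(c=100\) and the tolerance below are illustrative, and the gap shrinks further as \(c\) grows:

```python
# Compare rdist(c) against Normal(0, 1/(c+1)) for a moderately large c.
import numpy as np
from scipy import stats

c = 100.0
x = np.linspace(-0.3, 0.3, 301)
gap = np.max(np.abs(
    stats.rdist.pdf(x, c) - stats.norm.pdf(x, scale=np.sqrt(1.0 / (c + 1)))
))
print("max pdf gap at c=100:", gap)   # small relative to the peak density ~4
```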
3) Formal Definition#
Let \(X \sim \mathrm{rdist}(c)\) with \(c>0\). Define \(a = c/2\).
PDF#
The probability density function is
\[ f(x\mid c) = \frac{(1-x^2)^{a-1}}{B\!\left(\tfrac12, a\right)} = \frac{(1-x^2)^{\frac{c}{2}-1}}{B\!\left(\tfrac12, \tfrac{c}{2}\right)}, \qquad -1\le x\le 1, \]
and \(f(x\mid c)=0\) for \(|x|>1\).
Here \(B(\cdot,\cdot)\) is the Beta function
\[ B(p,q) = \int_0^1 t^{p-1}(1-t)^{q-1}\,dt = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}. \]
CDF#
Using the symmetric-Beta identity, the CDF can be written via the regularized incomplete beta function \(I_z\):
\[ F(x\mid c) = \mathbb{P}(X\le x) = I_{\frac{x+1}{2}}(a,a), \qquad -1\le x\le 1. \]
Equivalently (making symmetry explicit), for \(x\in[-1,1]\):
\[ F(x\mid c) = \tfrac12 + \tfrac12\,\mathrm{sign}(x)\,I_{x^2}\!\left(\tfrac12, a\right). \]
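The two CDF expressions can be checked against each other directly with `scipy.special.betainc` (the grid and \(c\) below are illustrative):

```python
# Compare I_{(x+1)/2}(a, a) with 1/2 + sign(x)/2 * I_{x^2}(1/2, a).
import numpy as np
from scipy import special

c = 5.0
a = c / 2
x = np.linspace(-0.999, 0.999, 401)
form1 = special.betainc(a, a, (x + 1) / 2)
form2 = 0.5 + 0.5 * np.sign(x) * special.betainc(0.5, a, x**2)
print("max gap between the two CDF forms:", np.max(np.abs(form1 - form2)))
```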
def _check_c(c: float) -> float:
c = float(c)
if not (np.isfinite(c) and c > 0):
raise ValueError("Require a finite shape parameter c > 0.")
return c
def rdist_logpdf(x, c: float):
'''Log-PDF of the canonical rdist(c) on [-1, 1]. Vectorized over x.'''
c = _check_c(c)
a = 0.5 * c
x = np.asarray(x, dtype=float)
out = np.full_like(x, -np.inf, dtype=float)
inside = (x > -1.0) & (x < 1.0)
xi = x[inside]
# log f(x|c) = (a-1) * log(1-x^2) - log B(1/2, a)
out[inside] = (a - 1.0) * np.log1p(-(xi * xi)) - special.betaln(0.5, a)
# Values at exactly x=±1 do not affect probabilities; we special-case c=2 (uniform).
edges = np.isclose(np.abs(x), 1.0)
if np.any(edges) and np.isclose(c, 2.0):
out[edges] = -np.log(2.0)
return out
def rdist_pdf(x, c: float):
return np.exp(rdist_logpdf(x, c))
def rdist_cdf(x, c: float):
'''CDF of the canonical rdist(c) on [-1, 1]. Vectorized over x.'''
c = _check_c(c)
a = 0.5 * c
x = np.asarray(x, dtype=float)
y = np.clip((x + 1.0) / 2.0, 0.0, 1.0)
out = special.betainc(a, a, y)
out = np.where(x <= -1.0, 0.0, out)
out = np.where(x >= 1.0, 1.0, out)
return out
def rdist_rvs_numpy(c: float, size=1, rng=None):
'''NumPy-only sampler for rdist(c) using Beta(a,a) with a=c/2.
Uses the identity:
If G1,G2 ~ Gamma(a,1) iid, then G1/(G1+G2) ~ Beta(a,a).
Then X = 2*Beta(a,a) - 1.
'''
c = _check_c(c)
a = 0.5 * c
rng = np.random.default_rng() if rng is None else rng
g1 = rng.gamma(shape=a, scale=1.0, size=size)
g2 = rng.gamma(shape=a, scale=1.0, size=size)
y = g1 / (g1 + g2)
return 2.0 * y - 1.0
4) Moments & Properties#
Let \(X \sim \mathrm{rdist}(c)\) and \(a=c/2\).
Mean, variance, skewness, kurtosis#
By symmetry, all odd moments are 0. In particular:
\[ \mathbb{E}[X]=0,\qquad \mathrm{skewness}=0. \]
The second moment (hence variance) is
\[ \mathbb{E}[X^2] = \frac{1}{c+1}, \qquad \mathrm{Var}(X)=\frac{1}{c+1}. \]
The fourth moment is
\[ \mathbb{E}[X^4] = \frac{3}{(c+1)(c+3)}, \]
which implies excess kurtosis
\[ \gamma_2 = \frac{\mathbb{E}[X^4]}{\mathrm{Var}(X)^2} - 3 = -\frac{6}{c+3}. \]
More generally, for \(n\ge 0\):
\[ \mathbb{E}[X^{2n}] = \frac{B\!\left(n+\tfrac12, a\right)}{B\!\left(\tfrac12, a\right)}, \qquad \mathbb{E}[X^{2n+1}] = 0. \]
MGF and characteristic function#
The moment generating function exists for all real \(t\):
\[ M_X(t) = \mathbb{E}[e^{tX}] = \Gamma\!\left(a+\tfrac12\right)\left(\frac{2}{|t|}\right)^{a-\tfrac12} I_{a-\tfrac12}(|t|), \qquad t\ne 0, \]
with \(M_X(0)=1\). Here \(I_\nu\) is the modified Bessel function of the first kind.
The characteristic function is
\[ \varphi_X(\omega) = \mathbb{E}[e^{i\omega X}] = \Gamma\!\left(a+\tfrac12\right)\left(\frac{2}{|\omega|}\right)^{a-\tfrac12} J_{a-\tfrac12}(|\omega|), \qquad \omega\ne 0, \]
with \(\varphi_X(0)=1\).
Entropy#
Using the Beta connection, the differential entropy is:
\[ \mathsf{H}(X) = \log\bigl(2B(a,a)\bigr) - 2(a-1)\,\psi(a) + (2a-2)\,\psi(2a), \]
where \(\psi\) is the digamma function.
def rdist_mean(c: float) -> float:
_check_c(c)
return 0.0
def rdist_var(c):
c = np.asarray(c, dtype=float)
if np.any(~np.isfinite(c)) or np.any(c <= 0):
raise ValueError("Require finite c > 0.")
return 1.0 / (c + 1.0)
def rdist_excess_kurtosis(c: float) -> float:
c = _check_c(c)
return -6.0 / (c + 3.0)
def rdist_even_moment(n: int, c: float) -> float:
'''Return E[X^(2n)] for X ~ rdist(c) (n >= 0).'''
c = _check_c(c)
n = int(n)
if n < 0:
raise ValueError("Require n >= 0.")
a = 0.5 * c
return float(np.exp(special.betaln(n + 0.5, a) - special.betaln(0.5, a)))
def rdist_mgf(t, c: float):
'''MGF M_X(t) using the Bessel-I representation (real, even in t).'''
c = _check_c(c)
a = 0.5 * c
nu = a - 0.5
t = np.asarray(t, dtype=float)
out = np.ones_like(t, dtype=float)
mask = t != 0
abs_t = np.abs(t[mask])
out[mask] = special.gamma(a + 0.5) * (2.0 / abs_t) ** nu * special.iv(nu, abs_t)
return out
def rdist_cf(w, c: float):
'''Characteristic function φ_X(ω) using the Bessel-J representation (real, even in ω).'''
c = _check_c(c)
a = 0.5 * c
nu = a - 0.5
w = np.asarray(w, dtype=float)
out = np.ones_like(w, dtype=float)
mask = w != 0
abs_w = np.abs(w[mask])
out[mask] = special.gamma(a + 0.5) * (2.0 / abs_w) ** nu * special.jv(nu, abs_w)
return out
def rdist_entropy(c: float) -> float:
'''Differential entropy of rdist(c) on [-1,1].'''
c = _check_c(c)
a = 0.5 * c
return float(
np.log(2.0)
+ special.betaln(a, a)
- 2.0 * (a - 1.0) * special.digamma(a)
+ (2.0 * a - 2.0) * special.digamma(2.0 * a)
)
# Quick numerical cross-check against SciPy
c0 = 5.0
x0 = np.linspace(-0.999, 0.999, 7)
print("max |pdf - scipy|:", np.max(np.abs(rdist_pdf(x0, c0) - stats.rdist.pdf(x0, c0))))
print("max |cdf - scipy|:", np.max(np.abs(rdist_cdf(x0, c0) - stats.rdist.cdf(x0, c0))))
print("mean,var,skew,kurt(excess) (formula):", rdist_mean(c0), rdist_var(c0), 0.0, rdist_excess_kurtosis(c0))
print("mean,var,skew,kurt(excess) (scipy): ", stats.rdist.stats(c0, moments="mvsk"))
print("entropy (formula):", rdist_entropy(c0))
print("entropy (scipy): ", stats.rdist.entropy(c0))
# Monte Carlo check
n_mc = 200_000
s = rdist_rvs_numpy(c0, size=n_mc, rng=rng)
print("MC mean/var:", s.mean(), s.var(ddof=0))
print("MGF(t=1) formula vs MC:", float(rdist_mgf(1.0, c0)), float(np.mean(np.exp(1.0 * s))))
max |pdf - scipy|: 2.220446049250313e-16
max |cdf - scipy|: 0.0
mean,var,skew,kurt(excess) (formula): 0.0 0.16666666666666666 0.0 -0.75
mean,var,skew,kurt(excess) (scipy): (0.0, 0.16666666666666663, 0.0, -0.7499999999999991)
entropy (formula): 0.49334217451750995
entropy (scipy): 0.4933421745175011
MC mean/var: -0.0004876065533899298 0.1668664435284834
MGF(t=1) formula vs MC: 1.0859813581363065 1.0855688443043081
5) Parameter Interpretation#
The single shape parameter \(c\) controls where the mass lives:
\(0 < c < 2\): \(\tfrac{c}{2}-1 < 0\) so \((1-x^2)^{\frac{c}{2}-1}\) blows up near \(x=\pm 1\). The distribution is U-shaped.
\(c = 2\): exponent is 0, so the density is constant. This is Uniform\((-1,1)\).
\(c > 2\): exponent is positive, so the density goes to 0 at \(\pm 1\) and peaks at 0. The distribution is bell-shaped.
A useful quantitative summary is the variance:
\[ \mathrm{Var}(X) = \frac{1}{c+1}. \]
Larger \(c\) means smaller variance and stronger concentration around 0.
# How the PDF changes with c
cs = [0.5, 1.0, 2.0, 3.0, 10.0]
eps = 1e-4
x = np.linspace(-1 + eps, 1 - eps, 1500)
fig = go.Figure()
for c in cs:
fig.add_trace(go.Scatter(x=x, y=rdist_pdf(x, c), name=f"c={c}"))
fig.update_layout(
title="R distribution PDF for different c",
xaxis_title="x",
yaxis_title="density",
)
fig.show()
# Variance as a function of c
c_grid = np.linspace(0.2, 20, 200)
fig2 = px.line(
x=c_grid,
y=rdist_var(c_grid),
labels={"x": "c", "y": "Var(X)"},
title="Variance Var(X)=1/(c+1)",
)
fig2.show()
6) Derivations#
We sketch three core derivations in the canonical \([-1,1]\) parameterization.
6.1 Expectation#
The density satisfies \(f(x\mid c)=f(-x\mid c)\). Therefore the integrand \(x f(x\mid c)\) is odd, so
\[ \mathbb{E}[X] = \int_{-1}^1 x f(x\mid c)\,dx = 0. \]
6.2 Variance#
Because \(\mathbb{E}[X]=0\), \(\mathrm{Var}(X)=\mathbb{E}[X^2]\). Write \(a=c/2\).
\[ \mathbb{E}[X^2] = \frac{1}{B(\tfrac12,a)}\int_{-1}^1 x^2 (1-x^2)^{a-1}\,dx. \]
Use symmetry and substitute \(u=x^2\):
\[ \begin{aligned} \int_{-1}^1 x^2 (1-x^2)^{a-1}\,dx &= 2\int_0^1 x^2 (1-x^2)^{a-1}\,dx \\ &= 2\int_0^1 u\,(1-u)^{a-1}\,\frac{1}{2\sqrt{u}}\,du \\ &= \int_0^1 u^{\frac32-1}(1-u)^{a-1}\,du \\ &= B\!\left(\tfrac32,a\right). \end{aligned} \]
So
\[ \mathbb{E}[X^2] = \frac{B(\tfrac32,a)}{B(\tfrac12,a)} = \frac{1}{2a+1} = \frac{1}{c+1}. \]
6.3 Likelihood (i.i.d. sample)#
Given i.i.d. data \(x_1,\dots,x_n\in(-1,1)\), the likelihood for \(c\) is
\[ L(c) = \prod_{i=1}^n \frac{(1-x_i^2)^{\frac{c}{2}-1}}{B(\tfrac12,\tfrac{c}{2})}. \]
The log-likelihood is
\[ \ell(c) = \left(\frac{c}{2}-1\right)\sum_{i=1}^n \log(1-x_i^2) - n\,\log B\!\left(\tfrac12,\tfrac{c}{2}\right). \]
Differentiating (using \(\frac{d}{dz}\log\Gamma(z)=\psi(z)\)) gives the score equation
\[ 0 = \ell'(c) = \frac12\sum_{i=1}^n \log(1-x_i^2) - \frac{n}{2}\Bigl(\psi\bigl(\tfrac{c}{2}\bigr) - \psi\bigl(\tfrac{c+1}{2}\bigr)\Bigr), \]
which is typically solved numerically for the MLE \(\hat c\).
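One way to solve the score equation numerically is a 1-D root finder; this is a hedged sketch where the bracketing interval and sample size are ad-hoc choices, not a recommended recipe:

```python
# Solve the score equation mean(log(1-x^2)) = psi(c/2) - psi((c+1)/2) for c.
import numpy as np
from scipy import optimize, special, stats

rng = np.random.default_rng(0)
x = stats.rdist.rvs(6.0, size=5000, random_state=rng)
target = np.mean(np.log1p(-x**2))  # (1/n) * sum log(1 - x_i^2)

def score(c):
    # Rescaled score: zero exactly at the MLE.
    return target - (special.digamma(c / 2) - special.digamma((c + 1) / 2))

c_hat = optimize.brentq(score, 1e-3, 1e3)  # bracket chosen generously
print("MLE c from score equation:", c_hat)  # close to the true c = 6
```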
def rdist_loglik(c: float, x: np.ndarray) -> float:
c = _check_c(c)
x = np.asarray(x, dtype=float)
if np.any((x <= -1.0) | (x >= 1.0)):
return -np.inf
return float(np.sum(rdist_logpdf(x, c)))
def rdist_mom_c(x: np.ndarray) -> float:
'''Method-of-moments estimate from Var(X)=1/(c+1).'''
x = np.asarray(x, dtype=float)
v = float(np.var(x, ddof=0))
if v <= 0:
return np.inf
return max(1e-8, 1.0 / v - 1.0)
# Compare MOM vs SciPy's MLE fit on synthetic data
c_true = 8.0
n = 4000
x = stats.rdist.rvs(c_true, size=n, random_state=rng)
c_mom = rdist_mom_c(x)
(c_mle, loc_mle, scale_mle) = stats.rdist.fit(x, floc=0, fscale=1)
print("true c:", c_true)
print("MOM c :", c_mom)
print("MLE c :", c_mle)
print("loglik(c_true):", rdist_loglik(c_true, x))
print("loglik(c_mom): ", rdist_loglik(c_mom, x))
print("loglik(c_mle): ", rdist_loglik(c_mle, x))
true c: 8.0
MOM c : 7.837035002401219
MLE c : 7.830761718750015
loglik(c_true): -1271.0385836930934
loglik(c_mom): -1270.5214543060163
loglik(c_mle): -1270.5207357833792
7) Sampling & Simulation#
NumPy-only sampler#
Using \((X+1)/2 \sim \mathrm{Beta}(a,a)\) with \(a=c/2\), we can sample from rdist using only NumPy:
Sample \(G_1, G_2 \overset{iid}{\sim} \mathrm{Gamma}(a,1)\)
Form \(B = G_1/(G_1+G_2)\) so \(B \sim \mathrm{Beta}(a,a)\)
Return \(X = 2B - 1\)
This avoids acceptance–rejection and is numerically stable for a wide range of \(c\).
# Sampling demo
c_demo = 1.5
samples = rdist_rvs_numpy(c_demo, size=10, rng=rng)
print(samples)
# Basic sanity: samples lie in [-1,1]
print("min/max:", samples.min(), samples.max())
[-0.687984 -0.570477 0.755541 -0.147472 0.250969 -0.816589 0.7435
-0.996741 -0.828464 0.881516]
min/max: -0.9967410604892193 0.8815164217779459
8) Visualization#
We’ll visualize:
the PDF for a few \(c\) values
the CDF
a Monte Carlo histogram compared to the theoretical density
# PDF and CDF for a chosen c
c_vis = 5.0
x = np.linspace(-1 + 1e-4, 1 - 1e-4, 1500)
fig_pdf = go.Figure()
fig_pdf.add_trace(go.Scatter(x=x, y=rdist_pdf(x, c_vis), name="pdf (formula)"))
fig_pdf.add_trace(go.Scatter(x=x, y=stats.rdist.pdf(x, c_vis), name="pdf (scipy)", line=dict(dash="dash")))
fig_pdf.update_layout(title=f"PDF for c={c_vis}", xaxis_title="x", yaxis_title="density")
fig_pdf.show()
fig_cdf = go.Figure()
fig_cdf.add_trace(go.Scatter(x=x, y=rdist_cdf(x, c_vis), name="cdf (formula)"))
fig_cdf.add_trace(go.Scatter(x=x, y=stats.rdist.cdf(x, c_vis), name="cdf (scipy)", line=dict(dash="dash")))
fig_cdf.update_layout(title=f"CDF for c={c_vis}", xaxis_title="x", yaxis_title="F(x)")
fig_cdf.show()
# Monte Carlo histogram
n_hist = 80_000
s = rdist_rvs_numpy(c_vis, size=n_hist, rng=rng)
fig_hist = go.Figure()
fig_hist.add_trace(
go.Histogram(
x=s,
nbinsx=90,
histnorm="probability density",
name="samples (hist)",
opacity=0.6,
)
)
fig_hist.add_trace(go.Scatter(x=x, y=rdist_pdf(x, c_vis), name="theoretical pdf", line=dict(color="black")))
fig_hist.update_layout(
title=f"Monte Carlo samples vs pdf (c={c_vis})",
xaxis_title="x",
yaxis_title="density",
barmode="overlay",
)
fig_hist.show()
9) SciPy Integration#
SciPy provides rdist as scipy.stats.rdist.
stats.rdist.pdf(x, c, loc=..., scale=...)
stats.rdist.cdf(x, c, loc=..., scale=...)
stats.rdist.rvs(c, size=..., random_state=...)
stats.rdist.fit(data, ...) (MLE by default)
Because rdist is a symmetric beta distribution, you can also work with stats.beta on \([0,1]\) and transform via \(x=2y-1\).
# Basic SciPy usage
c = 3.0
x = np.linspace(-0.999, 0.999, 9)
print("pdf:", stats.rdist.pdf(x, c))
print("cdf:", stats.rdist.cdf(x, c))
print("rvs:", stats.rdist.rvs(c, size=5, random_state=rng))
# Fit example: recover c when loc/scale are known
c_true = 6.0
data = stats.rdist.rvs(c_true, size=3000, random_state=rng)
(c_hat, loc_hat, scale_hat) = stats.rdist.fit(data, floc=0, fscale=1)
print("true c:", c_true)
print("fit c:", c_hat)
# Identity check: (X+1)/2 ~ Beta(c/2, c/2)
y = (data + 1) / 2
pdf_beta = stats.beta.pdf((x + 1) / 2, a=c_true / 2, b=c_true / 2) / 2 # Jacobian factor
pdf_r = stats.rdist.pdf(x, c_true)
print("max |pdf_r - transformed_beta_pdf|:", np.max(np.abs(pdf_r - pdf_beta)))
pdf: [0.028463 0.421625 0.551513 0.616446 0.63662 0.616446 0.551513 0.421625
0.028463]
cdf: [0.000019 0.072463 0.195777 0.342673 0.5 0.657327 0.804223 0.927537
0.999981]
rvs: [ 0.022809 -0.961042 -0.883764 -0.258481 -0.646669]
true c: 6.0
fit c: 6.054687500000011
max |pdf_r - transformed_beta_pdf|: 3.3306690738754696e-16
10) Statistical Use Cases#
10.1 Hypothesis testing#
Because rdist includes the uniform as \(c=2\), a simple test is:
\(H_0\): the data are uniform on \([-1,1]\) (i.e. \(c=2\))
\(H_1\): \(c\ne 2\)
A likelihood ratio test uses
\[ \Lambda = 2\bigl(\ell(\hat c) - \ell(2)\bigr), \]
which is approximately \(\chi^2_1\) under standard regularity assumptions (use with care in small samples).
10.2 Bayesian modeling#
rdist can be used as a prior over a symmetric bounded parameter, e.g. a correlation-like quantity \(\rho\in[-1,1]\). Larger \(c\) yields a prior more concentrated near 0.
10.3 Generative modeling#
If you need to generate bounded features or angles/cosines, rdist is a simple building block.
A particularly neat generative connection:
sample a random direction uniformly on the sphere \(S^{d-1}\)
the first coordinate is distributed as rdist(c=d-1)
So rdist can generate realistic “random projection” coordinates without generating the full vector.
# 10.1 Likelihood ratio test: H0 c=2 (uniform) vs H1 c!=2
def rdist_loglik_scipy(c: float, x: np.ndarray) -> float:
return float(np.sum(stats.rdist.logpdf(x, c)))
c_true = 5.0
n = 2000
x = stats.rdist.rvs(c_true, size=n, random_state=rng)
c_hat, _, _ = stats.rdist.fit(x, floc=0, fscale=1)
ll_hat = rdist_loglik_scipy(c_hat, x)
ll_null = rdist_loglik_scipy(2.0, x)
lr_stat = 2.0 * (ll_hat - ll_null)
p_value = stats.chi2.sf(lr_stat, df=1)
print("true c:", c_true)
print("MLE c:", c_hat)
print("LR stat:", lr_stat)
print("approx p-value (chi^2_1):", p_value)
true c: 5.0
MLE c: 5.0624023437500085
LR stat: 817.0196089693518
approx p-value (chi^2_1): 1.0758209264796379e-179
# 10.2 Bayesian modeling (simple grid posterior)
# Prior: rho ~ rdist(c_prior)
# Likelihood: y | rho ~ Normal(rho, sigma^2)
r = np.linspace(-1 + 1e-4, 1 - 1e-4, 2000)
y_obs = 0.55
sigma = 0.12
fig = go.Figure()
for c_prior in [1.0, 5.0, 20.0]:
prior = rdist_pdf(r, c_prior)
like = stats.norm.pdf(y_obs, loc=r, scale=sigma)
post_unnorm = prior * like
post = post_unnorm / np.trapz(post_unnorm, r)
fig.add_trace(go.Scatter(x=r, y=prior, name=f"prior c={c_prior}", line=dict(dash="dash")))
fig.add_trace(go.Scatter(x=r, y=post, name=f"posterior c={c_prior}"))
fig.update_layout(
title=f"Grid posterior for rho in [-1,1] (y={y_obs}, sigma={sigma})",
xaxis_title="rho",
yaxis_title="density",
)
fig.show()
# 10.3 Generative modeling: coordinate of a random direction on S^{d-1}
d = 10
c_sphere = d - 1
n = 120_000
z = rng.normal(size=(n, d))
z = z / np.linalg.norm(z, axis=1, keepdims=True)
x1 = z[:, 0]
x_grid = np.linspace(-1 + 1e-4, 1 - 1e-4, 1500)
fig = go.Figure()
fig.add_trace(
go.Histogram(
x=x1,
nbinsx=90,
histnorm="probability density",
name=f"sphere coord (d={d})",
opacity=0.6,
)
)
fig.add_trace(
go.Scatter(
x=x_grid,
y=stats.rdist.pdf(x_grid, c_sphere),
name=f"rdist(c={c_sphere}) pdf",
line=dict(color="black"),
)
)
fig.update_layout(
title="First coordinate of a random unit vector matches rdist",
xaxis_title="x1",
yaxis_title="density",
barmode="overlay",
)
fig.show()
11) Pitfalls#
Invalid parameters: require \(c>0\). In SciPy, also require scale > 0.
Endpoint behavior: for \(c<2\), the density diverges as \(|x|\to 1\). This is integrable, but plotting or evaluating exactly at \(x=\pm 1\) can produce inf.
Numerical stability: for large \(c\) or \(|x|\) near 1, compute in log-space (use logpdf, betaln, log1p).
Fitting: unconstrained fit may try to move loc/scale. If you know the data should live on \([-1,1]\), use floc=0, fscale=1.
Out-of-support data: any \(|x|>1\) breaks the model; normalize/clip only if you can justify it scientifically.
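The out-of-support pitfall is easy to demonstrate; the data values below are made up for illustration:

```python
# A single point outside [-1, 1] has zero density, so it drives
# the total log-likelihood to -inf.
import numpy as np
from scipy import stats

data = np.array([0.2, -0.5, 1.002])   # last point slipped outside [-1, 1]
ll = stats.rdist.logpdf(data, 3.0)
print("per-point logpdf:", ll)        # last entry: -inf
print("total loglik:", ll.sum())      # -inf
```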
# Numerical stability: pdf vs logpdf near the boundary for large c
c_big = 50.0
x_edge = np.array([0.0, 0.9, 0.99, 0.999, 0.9999])
pdf_direct = rdist_pdf(x_edge, c_big)
logpdf = rdist_logpdf(x_edge, c_big)
print("x:", x_edge)
print("pdf:", pdf_direct)
print("logpdf:", logpdf)
print("exp(logpdf):", np.exp(logpdf))
x: [0. 0.9 0.99 0.999 0.9999]
pdf: [2.806879 0. 0. 0. 0. ]
logpdf: [ 1.032073 -38.825476 -92.97678 -148.130524 -203.381763]
exp(logpdf): [2.806879 0. 0. 0. 0. ]
12) Summary#
rdist is a continuous, symmetric distribution on \([-1,1]\) with shape parameter \(c>0\).
PDF: \(f(x\mid c)=(1-x^2)^{c/2-1}/B(1/2,c/2)\).
CDF: \(F(x\mid c)=I_{(x+1)/2}(c/2,c/2)\).
Mean \(0\), variance \(1/(c+1)\), excess kurtosis \(-6/(c+3)\); odd moments vanish.
Sampling is easy via the Beta identity: draw \(B\sim\mathrm{Beta}(c/2,c/2)\), return \(X=2B-1\).
In SciPy: stats.rdist.pdf, cdf, rvs, and fit cover most workflows.